1.
G3 (Bethesda) ; 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38427916

ABSTRACT

Tanoak (Notholithocarpus densiflorus) is an evergreen tree in the Fagaceae family found in California and southern Oregon. Historically, tanoak acorns were an important food source for Native American tribes and the bark was used extensively in the leather tanning process. Long considered a disjunct relictual element of the Asian stone oaks (Lithocarpus spp.), phylogenetic analysis has determined that the tanoak is an example of convergent evolution. Tanoaks are deeply divergent from oaks (Quercus) of the Pacific Northwest and comprise a new genus with a single species. These trees are highly susceptible to 'sudden oak death' (SOD), a disease caused by the plant pathogen Phytophthora ramorum that has led to widespread mortality of tanoaks. Here, we set out to assemble the genome and perform comparative studies among a number of individuals that demonstrated varying levels of susceptibility to SOD. First, we sequenced and de novo assembled a draft reference genome of N. densiflorus using co-barcoded library processing methods and an MGI DNBSEQ-G400 sequencer. To increase the contiguity of the final assembly, we also sequenced Oxford Nanopore (ONT) long reads to 30X coverage. To our knowledge, the draft genome reported here is one of the more contiguous and complete genomes of a tree species published to date, with a contig N50 of ∼1.2 Mb, a scaffold N50 of ∼2.1 Mb, and a complete gene score of 95.5% through BUSCO analysis. In addition, we sequenced 11 genetically distinct individuals and mapped these onto the draft reference genome, enabling the discovery of almost 25 million single nucleotide polymorphisms and ∼4.4 million small insertions and deletions. Finally, using co-barcoded data we were able to generate complete haplotype coverage of all 11 genomes.
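The contig and scaffold N50 values quoted in this abstract follow a standard definition: the length L such that sequences of length L or longer make up at least half of the total assembly. The following is a minimal sketch of that calculation; the function name `n50` and the toy lengths are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total is 20, and 6 + 5 = 11 covers at least half of it.
print(n50([2, 3, 4, 5, 6]))
```

N50 describes contiguity only; completeness is assessed separately, here through BUSCO.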

2.
Cell Rep Methods ; 3(3): 100437, 2023 03 27.
Article in English | MEDLINE | ID: mdl-37056375

ABSTRACT

Sequencing of hypervariable regions as well as internal transcribed spacer regions of ribosomal RNA genes (rDNA) is broadly used to identify bacteria and fungi, but taxonomic and phylogenetic resolution is hampered by insufficient sequencing length using high throughput, cost-efficient second-generation sequencing. We developed a method to obtain nearly full-length rDNA by assembling single DNA molecules combining DNA co-barcoding with single-tube long fragment read technology and second-generation sequencing. Benchmarking was performed using mock bacterial and fungal communities as well as two forest soil samples. All mock species rDNA were successfully recovered with identities above 99.5% compared to the reference sequences. From the soil samples we obtained good coverage with identification of more than 20,000 unknown species, as well as high abundance correlation between replicates. This approach provides a cost-effective method for obtaining extensive and accurate information on complex environmental microbial communities.


Subject(s)
Eukaryota, Microbiota, Phylogeny, Eukaryota/genetics, rRNA Genes, DNA Sequence Analysis/methods, Ribosomal RNA/genetics, Bacteria/genetics, Microbiota/genetics, Ribosomal DNA/genetics, Soil
3.
Methods Mol Biol ; 2590: 59-70, 2023.
Article in English | MEDLINE | ID: mdl-36335492

ABSTRACT

In this chapter, we describe a simple, low-cost method for making many copies of a single DNA molecule (1-10 kb in length) as a concatemer on a long DNA strand. This can enable applications requiring high-quality contiguous sequence and haplotype data from long single DNA molecules at large scale.


Subject(s)
DNA, High-Throughput Nucleotide Sequencing, Haplotypes/genetics, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, DNA/genetics
4.
Methods Mol Biol ; 2590: 101-125, 2023.
Article in English | MEDLINE | ID: mdl-36335495

ABSTRACT

In this chapter, we describe single-tube long fragment read (stLFR), a simple preparation method for whole-genome sequencing and physical haplotyping based on the DNA co-barcoding strategy. Similar to LFR, stLFR applies the concept of adding the same barcode to subfragments derived from the same long DNA molecule. However, instead of a 384-well plate, stLFR uses the surface of micron-sized magnetic beads to create millions of virtual compartments in a single reaction tube. This is enabled by a split and pool barcoded bead preparation process capable of generating ~500,000 copies of the same unique barcode, from a library of ~3.6 billion unique barcodes, on each bead. The instruments and devices used in the stLFR process are easily accessible in nearly all standard molecular biology laboratories, and the cost of reagents can be as low as 30 dollars per sample. stLFR libraries can be sequenced by standard second-generation sequencing instruments (e.g., MGI or Illumina devices), and the barcode sharing information enables detection and phasing of all variations, including large structural variations. In addition, stLFR data can be used to scaffold contigs and de novo assemble genomes.


Subject(s)
High-Throughput Nucleotide Sequencing, Cost-Benefit Analysis, Haplotypes, Whole Genome Sequencing, High-Throughput Nucleotide Sequencing/methods, Gene Library, DNA Sequence Analysis
5.
Enzyme Microb Technol ; 150: 109878, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34489031

ABSTRACT

In this article we describe a sensitive exonuclease III detection method using a DNB nanoarray from a BGISEQ-500 sequencing kit and demonstrate a detection limit as low as 0.001 U/mL. The flow cell of the sequencing kit was loaded with billions of DNA nanoballs (DNBs) to form the DNB nanoarray and initially used for massively parallel sequencing. The 3'-end recessed dsDNA structure formed by sequencing was shown to be a perfect substrate for exonuclease III, but not for other nucleases such as exonuclease I, RecJf and nuclease P1. We developed an exonuclease III assay using the DNB nanoarray, together with other reagents within the BGISEQ-500 sequencing kit, which only required one additional cycle of sequencing. The DNB nanoarray can be reused for the exonuclease III assay at least five times. This method demonstrated superior sensitivity, selectivity, and reusability compared with other assay methods and is accompanied by low cost and simple setup.


Subject(s)
DNA, Technology, Exodeoxyribonucleases
6.
Nucleic Acids Res ; 49(2): e10, 2021 01 25.
Article in English | MEDLINE | ID: mdl-33290507

ABSTRACT

Results of massive parallel sequencing-by-synthesis vary depending on the sequencing approach. CoolMPS™ is a new sequencing chemistry that incorporates bases by labeled antibodies. To evaluate the performance, we sequenced 240 human non-coding RNA samples (dementia patients and controls) with and without CoolMPS. The Q30 value as indicator of the per base sequencing quality increased from 91.8 to 94%. The higher quality was reached across the whole read length. Likewise, the percentage of reads mapping to the human genome increased from 84.9 to 86.2%. For both technologies, we computed similar distributions between different RNA classes (miRNA, piRNA, tRNA, snoRNA and yRNA) and within the classes. While standard sequencing-by-synthesis allowed to recover more annotated miRNAs, CoolMPS yielded more novel miRNAs. The correlation between the two methods was 0.97. Evaluating the diagnostic performance, we observed lower minimal P-values for CoolMPS (adjusted P-value of 0.0006 versus 0.0004) and larger effect sizes (Cohen's d of 0.878 versus 0.9). Validating 19 miRNAs resulted in a correlation of 0.852 between CoolMPS and reverse transcriptase-quantitative polymerase chain reaction. Comparison to data generated with Illumina technology confirmed a known shift in the overall RNA composition. With CoolMPS we evaluated a novel sequencing-by-synthesis technology showing high performance for the analysis of non-coding RNAs.
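The Q30 figures in this abstract rest on the Phred scale, where a quality score Q corresponds to an estimated per-base error probability of 10^(-Q/10), so Q30 means roughly one error in 1,000 bases. The sketch below shows the conversion and how a Q30 percentage can be computed from a list of per-base scores; the function names and the sample scores are illustrative, not part of the study.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def fraction_q30(quals, threshold=30):
    """Fraction of base calls whose quality is at or above the threshold."""
    return sum(q >= threshold for q in quals) / len(quals)


# Hypothetical per-base qualities for a short read.
scores = [30, 35, 20, 40]
print(phred_to_error_prob(30))
print(fraction_q30(scores))
```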


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Untranslated RNA/chemistry, RNA Sequence Analysis/methods, Antibody Specificity, Biomarkers, Computational Biology, Complementary DNA/genetics, Genetic Databases, Datasets as Topic, Dementia/blood, Dementia/genetics, Direct Fluorescent Antibody Technique, Gene Library, Humans, Liquid Biopsy, MicroRNAs/chemistry, MicroRNAs/genetics, Nucleotides/immunology, Untranslated RNA/chemical synthesis, Untranslated RNA/genetics, Reproducibility of Results, Reverse Transcriptase Polymerase Chain Reaction
7.
Gigascience ; 9(12)2020 12 21.
Article in English | MEDLINE | ID: mdl-33347571

ABSTRACT

BACKGROUND: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample. RESULTS: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements. CONCLUSIONS: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.


Subject(s)
Bacterial Genome, High-Throughput Nucleotide Sequencing, Plant Genome, DNA Sequence Analysis, Software
8.
Sci Rep ; 10(1): 18863, 2020 11 02.
Article in English | MEDLINE | ID: mdl-33139759

ABSTRACT

Recent studies show that non-coding RNAs (ncRNAs) can regulate the expression of protein-coding genes and play important roles in mammalian development. Previous studies have revealed that during Caenorhabditis elegans (C. elegans) embryo development, numerous genes in each cell are spatiotemporally regulated, causing the cell to differentiate into distinct cell types and tissues. We ask whether ncRNAs participate in the spatiotemporal regulation of genes in different types of cells and tissues during the embryogenesis of C. elegans. Here, by using a marker-free, full-length, high-depth single-cell RNA sequencing (scRNA-seq) technique, we sequence the whole transcriptomes of 1031 embryonic cells of C. elegans and detect 20,431 protein-coding genes, including 22 cell-type-specific protein-coding markers, and 9843 ncRNAs, including 11 cell-type-specific ncRNA markers. We introduce an ncRNA-based clustering strategy as a complement to the protein-coding gene-based clustering strategy for single-cell classification. We identify 94 ncRNAs, never previously reported to regulate gene expression, that are co-expressed with 1208 protein-coding genes in cell-type-specific and/or embryo-time-specific manners. Our findings suggest that these ncRNAs could potentially influence the spatiotemporal expression of the corresponding genes during the embryogenesis of C. elegans.


Subject(s)
Caenorhabditis elegans/genetics, Embryonic Development/genetics, Untranslated RNA/genetics, Transcriptome/genetics, Animals, Caenorhabditis elegans/growth & development, Caenorhabditis elegans Proteins/genetics, Developmental Gene Expression Regulation/genetics, Untranslated RNA/classification, Single-Cell Analysis
9.
PeerJ ; 8: e8431, 2020.
Article in English | MEDLINE | ID: mdl-32231869

ABSTRACT

Recent advances in long fragment read (LFR, also known as linked-read or read-cloud) technologies, such as single tube long fragment reads (stLFR), 10X Genomics Chromium reads, and TruSeq synthetic long-reads, have enabled efficient haplotyping and genome assembly. However, in the case of stLFR and 10X Genomics Chromium reads, the long fragments of a genome are covered sparsely by reads in each barcode and most barcodes are contained in multiple long fragments from different regions, which results in inefficient assembly when using long-range information. Thus, methods to address these shortcomings are vital for capitalizing on the additional information obtained using these technologies. We therefore designed IterCluster, a novel, alignment-free clustering algorithm that can cluster barcodes from the same target region of a genome, using k-mer frequency-based features and a Markov Cluster (MCL) approach to identify enough reads in a target region of a genome to ensure sufficient target genome sequence depth. The IterCluster method was validated using BGI stLFR and 10X Genomics Chromium read datasets. IterCluster had a higher precision and recall rate on BGI stLFR data compared to 10X Genomics Chromium read data. In addition, we demonstrated how IterCluster improves the de novo assembly results when using a divide-and-conquer strategy on a human genome data set (scaffold/contig N50 = 13.2 kbp/7.1 kbp vs. 17.1 kbp/11.9 kbp before and after IterCluster, respectively). IterCluster provides a new way of determining LFR barcode enrichment and a novel approach for de novo assembly using LFR data. IterCluster is open source and available at https://github.com/JianCong-WENG/IterCluster.
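To illustrate what an alignment-free, k-mer frequency feature looks like in general, the sketch below turns the reads sharing one barcode into a normalized k-mer frequency vector. This is not IterCluster's actual feature definition or code, and the MCL clustering step is not shown; the function name, the choice of k, and the example sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return a normalized frequency vector over all 4**k DNA k-mers.

    K-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed lexicographic ordering so vectors from different
    # barcodes are directly comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical read sequence attached to one barcode.
vector = kmer_frequencies("ACGTACGT", k=2)
print(len(vector))
```

Vectors built this way can be compared with any similarity measure to build a barcode graph, which is the kind of input a graph clustering method such as MCL operates on.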

10.
GigaByte ; 2020: gigabyte4, 2020.
Article in English | MEDLINE | ID: mdl-36824597

ABSTRACT

Nyssa yunnanensis is a deciduous tree species in the family Nyssaceae within the order Cornales. As only eight individual trees and two populations have been recorded in China's Yunnan province, this species has been listed among China's national Class I protection species since 1999 and also among 120 PSESP (Plant Species with Extremely Small Populations) in the Implementation Plan of Rescuing and Conserving China's Plant Species with Extremely Small Populations (PSESP) (2011-2-15). Here, we present the draft genome assembly of N. yunnanensis. Using 10X Genomics linked-reads sequencing data, we carried out the de novo assembly and annotation analysis. The N. yunnanensis genome assembly is 1475 Mb in length, containing 288,519 scaffolds with a scaffold N50 length of 985.59 kb. Within the assembled genome, 799.51 Mb was identified as repetitive elements, accounting for 54.24% of the sequenced genome, and a total of 39,803 protein-coding genes were predicted. With the genomic characteristics of N. yunnanensis available, our study might facilitate future conservation biology studies to help protect this extremely threatened tree species.

11.
BMC Genomics ; 20(1): 604, 2019 Jul 23.
Article in English | MEDLINE | ID: mdl-31337347

ABSTRACT

BACKGROUND: RNA-Seq data is inherently nonuniform for different transcripts because of differences in gene expression. This makes it challenging to decide how much data should be generated from each sample. How much should one spend to recover the less expressed transcripts? The sequencing technology used is another consideration, as there are inevitably always biases against certain sequences. To investigate these effects, we first looked at high-depth libraries from a set of well-annotated organisms to ascertain the impact of sequencing depth on de novo assembly. We then looked at libraries sequenced from the Universal Human Reference RNA (UHRR) to compare the performance of Illumina HiSeq and MGI DNBseq™ technologies. RESULTS: On the issue of sequencing depth, the amount of exomic sequence assembled plateaued using data sets of approximately 2 to 8 Gbp. However, the amount of genomic sequence assembled did not plateau for many of the analyzed organisms. Most of the unannotated genomic sequences are single-exon transcripts whose biological significance will be questionable for some users. On the issue of sequencing technology, both of the analyzed platforms recovered a similar number of full-length transcripts. The missing "gap" regions in the HiSeq assemblies were often attributed to higher GC contents, but this may be an artefact of library preparation and not of sequencing technology. CONCLUSIONS: Increasing sequencing depth beyond modest data sets of less than 10 Gbp recovers a plethora of single-exon transcripts undocumented in genome annotations. DNBseq™ is a viable alternative to HiSeq for de novo RNA-Seq assembly.


Subject(s)
RNA-Seq/methods, Animals, Arabidopsis, Exons, Gene Library, Humans, Molecular Sequence Annotation, Open Reading Frames, Oryza
12.
DNA Res ; 26(4): 313-325, 2019 Aug 01.
Article in English | MEDLINE | ID: mdl-31173071

ABSTRACT

The diversity of disease presentations warrants a single assay for detection and delineation of various genomic disorders. Herein, we describe a gel-free and biotin-capture-free mate-pair method through coupling Controlled Polymerizations by Adapter-Ligation (CP-AL). We first demonstrated the feasibility and ease-of-use of monitoring DNA nick translation and primer extension by limiting the nucleotide input. By coupling these two controlled polymerizations through a previously reported non-conventional adapter-ligation reaction, 3' branch ligation, we showed that CP-AL significantly increased DNA circularization efficiency (by 4-fold) and was applicable to different sequencing methods at a fraction of the current cost. Its advantages were further demonstrated by the elimination of small-insert contamination (by 39.3-fold) with a ∼50% increase in physical coverage, uniform genome/exome coverage, and the lowest chimeric rate. It achieved single-nucleotide variant detection with sensitivity and specificity of up to 97.3 and 99.7%, respectively, compared with data from small-insert libraries. In addition, this method can provide a comprehensive delineation of structural rearrangements, as shown by a potential diagnosis in a patient with oligo-astheno-teratospermia. Moreover, it enables accurate mutation identification by integration of genomic variants from different aberration types. Overall, it provides a potential single integrated solution for detecting various genomic variants, facilitating genetic diagnosis in human diseases.


Subject(s)
Genome-Wide Association Study/methods, Genotyping Techniques/methods, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods, Genetic Predisposition to Disease, Humans, Male Infertility/genetics, Male
13.
Genome Res ; 29(5): 798-808, 2019 05.
Article in English | MEDLINE | ID: mdl-30940689

ABSTRACT

Here, we describe single-tube long fragment read (stLFR), a technology that enables sequencing of data from long DNA molecules using economical second-generation sequencing technology. It is based on adding the same barcode sequence to subfragments of the original long DNA molecule (DNA cobarcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically nonredundant cobarcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique cobarcoding of more than 8 million 20- to 300-kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high-quality variant calling and phase block lengths up to N50 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries, and their construction did not significantly add to the time or cost of whole-genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high-quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.
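The claim of "practically nonredundant" cobarcoding can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes, as a simplification not stated in the abstract, that each of the 50 million barcodes used in a sample is drawn independently and uniformly from the 3.6 billion possible sequences; it then estimates the probability that a given barcode is shared with at least one other bead. The function name is illustrative.

```python
import math


def expected_shared_fraction(n_used, n_pool):
    """Probability that one uniformly drawn barcode is also drawn
    by at least one of the other n_used - 1 independent draws."""
    # 1 - (1 - 1/n_pool)**(n_used - 1), computed in a numerically
    # stable way for very small 1/n_pool.
    return -math.expm1((n_used - 1) * math.log1p(-1 / n_pool))


# Figures quoted in the abstract: 50 million barcodes per sample
# out of up to 3.6 billion unique sequences.
fraction = expected_shared_fraction(50_000_000, 3_600_000_000)
print(f"{fraction:.2%}")
```

Under this uniform-sampling assumption, only about 1.4% of barcodes would coincide with another bead's barcode, which is consistent with the abstract's description.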


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Whole Genome Sequencing/methods, Cost-Benefit Analysis, Diploidy, Gene Library, Human Genome, Genomics, Haplotypes/genetics, High-Throughput Nucleotide Sequencing/economics, Humans, Whole Genome Sequencing/economics
14.
BMC Genomics ; 20(1): 215, 2019 Mar 13.
Article in English | MEDLINE | ID: mdl-30866797

ABSTRACT

BACKGROUND: Massively parallel sequencing, coupled with sample multiplexing, has made genetic tests broadly affordable. However, intractable index mis-assignments (commonly exceeding 1%) have been repeatedly reported on some widely used sequencing platforms. RESULTS: Here, we investigated this quality issue on BGI sequencers using three library preparation methods: whole genome sequencing (WGS) with PCR, PCR-free WGS, and two-step targeted PCR. BGI's sequencers utilize a unique DNA nanoball (DNB) technology which uses rolling circle replication for DNA-nanoball preparation; this linear amplification is PCR free and can avoid error accumulation. We demonstrated that single index mis-assignment from free indexed oligos occurs at a rate of one in 36 million reads, suggesting virtually no index hopping during DNB creation and arraying. Furthermore, the DNB-based NGS libraries have achieved an unprecedentedly low sample-to-sample mis-assignment rate of 0.0001 to 0.0004% under recommended procedures. CONCLUSIONS: Single indexing with DNB technology provides a simple but effective method for sensitive genetic assays with large sample numbers.


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Bacteria/genetics, Humans, Whole Genome Sequencing, Workflow
15.
Nucleic Acids Res ; 47(6): 2981-2995, 2019 04 08.
Article in English | MEDLINE | ID: mdl-30698752

ABSTRACT

To fully understand human genetic variation and its functional consequences, the specific distribution of variants between the two chromosomal homologues of genes must be known. The 'phase' of variants can significantly impact gene function and phenotype. To assess patterns of phase at large scale, we have analyzed 18 121 autosomal genes in 1092 statistically phased genomes from the 1000 Genomes Project and 184 experimentally phased genomes from the Personal Genome Project. Here we show that genes with cis-configurations of coding variants are more frequent than genes with trans-configurations in a genome, with global cis/trans ratios of ∼60:40. Significant cis-abundance was observed in virtually all genomes in all populations. Moreover, we identified a large group of genes exhibiting cis-configurations of protein-changing variants in excess, so-called 'cis-abundant genes', and a smaller group of 'trans-abundant genes'. These two gene categories were functionally distinguishable, and exhibited strikingly different distributional patterns of protein-changing variants. Underlying these phenomena was a shared set of phase-sensitive genes of importance for adaptation and evolution. This work establishes common patterns of phase as key characteristics of diploid human exomes and provides evidence for their functional significance, highlighting the importance of phase for the interpretation of protein-coding genetic variation and gene function.
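The cis/trans distinction described in this abstract can be made concrete with a small sketch. It assumes a simplified rule that may differ from the paper's exact classification: for a gene with two or more heterozygous coding variants in a phased genome, the gene is "cis" when all variant alleles sit on the same homologue and "trans" when they are split across both. The function name and input encoding are illustrative.

```python
def phase_configuration(alt_haplotypes):
    """Classify a gene from phased heterozygous variants.

    alt_haplotypes holds one entry per heterozygous variant:
    0 or 1, naming the homologue that carries the variant allele.
    """
    if len(alt_haplotypes) < 2:
        # Phase is only informative with at least two variants.
        return "undetermined"
    return "cis" if len(set(alt_haplotypes)) == 1 else "trans"


# Both variants on homologue 0: the other copy stays unaltered.
print(phase_configuration([0, 0]))
# One variant on each homologue: both copies are altered.
print(phase_configuration([0, 1]))
```

The comments reflect why phase can matter functionally: in a cis configuration one gene copy remains free of the variants, whereas in trans both copies carry one.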


Subject(s)
Diploidy, Human Genome/genetics, Open Reading Frames/genetics, Quantitative Trait Loci/genetics, Exome/genetics, Genetic Variation, Haplotypes/genetics, Humans, Single Nucleotide Polymorphism/genetics
16.
DNA Res ; 26(1): 45-53, 2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30428014

ABSTRACT

Nucleic acid ligases are crucial enzymes that repair breaks in DNA or RNA during synthesis, repair and recombination. Various genomic tools have been developed using the diverse activities of DNA/RNA ligases. Herein, we demonstrate a non-conventional ability of T4 DNA ligase to insert 5' phosphorylated blunt-end double-stranded DNA into DNA breaks at 3'-recessed ends, gaps, or nicks to form a Y-shaped 3'-branch structure. This base pairing-independent ligation is therefore termed 3'-branch ligation (3'BL). In an extensive study of optimal ligation conditions, the presence of 10% PEG-8000 in the ligation buffer significantly increased ligation efficiency to more than 80%. Ligation efficiency varied slightly between different donor and acceptor sequences. More interestingly, we discovered that T4 DNA ligase efficiently ligated DNA to the 3'-recessed end of RNA, not to that of DNA, in a DNA/RNA hybrid, suggesting a ternary complex formation preference of T4 DNA ligase. These novel properties of T4 DNA ligase can be utilized as a broad molecular technique in many important genomic applications, such as 3'-end labelling by adding a universal sequence; directional tagmentation for NGS library construction that achieves theoretical 100% template usage; and targeted RNA NGS libraries with mitigated structure-based bias and adapter dimer problems.


Subject(s)
DNA Ligases/metabolism, DNA/metabolism, Genetic Engineering/methods, High-Throughput Nucleotide Sequencing/methods, RNA/metabolism, Humans
17.
Genet Med ; 20(5): 495-502, 2018 04.
Article in English | MEDLINE | ID: mdl-29758565

ABSTRACT

PURPOSE: We describe a novel syndrome in seven female patients with extreme developmental delay and neoteny. METHODS: All patients in this study were female, aged 4 to 23 years, were well below the fifth percentile in height and weight, had failed to develop sexually, and lacked the use of language. Karyotype and array chromosome genomic hybridization analysis failed to identify large-scale structural variations. To further understand the underlying cause of disease in these patients, whole-genome sequencing was performed. RESULTS: In five patients, coding de novo mutations (DNMs) were found in five different genes. These genes fell into similar functional categories of transcription regulation and chromatin modification. Comparison to a control population suggested that individuals with neotenic complex syndrome (NCS), a name that we propose herein, could have an excess of rare inherited variants in genes associated with developmental delay and autism, although the difference was not significant. CONCLUSION: We describe an extreme form of developmental delay, with the defining characteristic of neoteny. In most patients we identified coding DNMs in a set of genes intolerant of haploinsufficiency; however, it is not clear whether these contributed to NCS. Rare inherited variants may also be associated with NCS, but more samples need to be analyzed to achieve statistical significance.


Subject(s)
Multiple Abnormalities/diagnosis, Multiple Abnormalities/genetics, Genetic Association Studies, Genetic Predisposition to Disease, Genetic Testing, Phenotype, Adolescent, Adult, Alleles, Amino Acid Substitution, Child, Preschool Child, Facies, Female, Gene Frequency, Genetic Testing/methods, Genotype, Humans, Male, Syndrome, Whole Genome Sequencing, Young Adult
18.
Clin Chem ; 64(4): 715-725, 2018 04.
Article in English | MEDLINE | ID: mdl-29545257

ABSTRACT

BACKGROUND: Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. METHODS: cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. RESULTS: Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 (CHD8) and LDL receptor-related protein 1 (LRP1), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. CONCLUSIONS: We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures.


Subject(s)
Amniotic Fluid/metabolism, Fetus/metabolism, Human Genome, Whole Genome Sequencing, Multiple Abnormalities/genetics, Adult, Amniocentesis, Autism Spectrum Disorder/genetics, Cohort Studies, DNA Copy Number Variations, Darier Disease/genetics, Eyebrows/abnormalities, Feasibility Studies, Female, Genetic Predisposition to Disease, Humans, Male, Mutation
19.
Gigascience ; 7(3): 1-8, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29293960

ABSTRACT

Background: More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of 2 Illumina platforms. Findings: Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform vs the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the 2 Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high-quality reads (96.06% of raw reads) per sample, with 90.56% of bases scoring Q30 and above, was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias toward genes with higher GC content being enriched on the HiSeq platforms. Conclusions: Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.


Subject(s)
Bacteria/genetics, Gastrointestinal Microbiome/genetics, Metagenomics/methods, DNA Sequence Analysis/methods, Bacteria/classification, Computational Biology, Feces/microbiology, High-Throughput Nucleotide Sequencing/methods, Humans
20.
Hum Genomics ; 11(1): 30, 2017 Dec 08.
Article in English | MEDLINE | ID: mdl-29216901

ABSTRACT

BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a devastating disease whose complex pathology has been associated with a strong genetic component in the context of both familial and sporadic disease. Herein, we adopted a next-generation sequencing approach to Greek patients suffering from sporadic ALS (together with their healthy counterparts) in order to explore further the genetic basis of sporadic ALS (sALS). RESULTS: Whole-genome sequencing analysis of Greek sALS patients revealed a positive association between FTO and TBC1D1 gene variants and sALS. Further, linkage disequilibrium analyses were suggestive of a specific disease-associated haplotype for FTO gene variants. Genotyping for these variants was performed in Greek, Sardinian, and Turkish sALS patients. A lack of association between FTO and TBC1D1 variants and sALS in patients of Sardinian and Turkish descent may suggest a founder effect in the Greek population. FTO was found to be highly expressed in motor neurons, while in silico analyses predicted an impact on FTO and TBC1D1 mRNA splicing for the genomic variants in question. CONCLUSIONS: To our knowledge, this is the first study to present a possible association between FTO gene variants and the genetic etiology of sALS. In addition, the next-generation sequencing-based genomics approach coupled with the two-step validation strategy described herein has the potential to be applied to other types of human complex genetic disorders in order to identify variants of clinical significance.


Subject(s)
Alpha-Ketoglutarate-Dependent Dioxygenase FTO/genetics, Amyotrophic Lateral Sclerosis/genetics, Alpha-Ketoglutarate-Dependent Dioxygenase FTO/metabolism, Case-Control Studies, Computer Simulation, Founder Effect, GTPase-Activating Proteins/genetics, Greece, Haplotypes, Humans, Linkage Disequilibrium, Motor Neurons/pathology, Motor Neurons/physiology, Single Nucleotide Polymorphism